Application of microarray analysis on computer cluster and cloud platforms.
Abstract
BACKGROUND: Analysis of recent high-dimensional biological data tends to be computationally intensive as many common approaches such as resampling or permutation tests require the basic statistical analysis to be repeated many times. A crucial advantage of these methods is that they can be easily parallelized due to the computational independence of the resampling or permutation iterations, which has induced many statistics departments to establish their own computer clusters. An alternative is to rent computing resources in the cloud, e.g. at Amazon Web Services.
OBJECTIVES: In this article we analyze whether a selection of statistical projects, recently implemented at our department, can be efficiently realized on these cloud resources. Moreover, we illustrate an opportunity to combine computer cluster and cloud resources.
METHODS: In order to compare the efficiency of computer cluster and cloud implementations and their respective parallelizations we use microarray analysis procedures and compare their runtimes on the different platforms.
RESULTS: Amazon Web Services provide various instance types which meet the particular needs of the different statistical projects we analyzed in this paper. Moreover, the network capacity is sufficient and the parallelization is comparable in efficiency to standard computer cluster implementations.
CONCLUSION: Our results suggest that many statistical projects can be efficiently realized on cloud resources. It is important to mention, however, that workflows can change substantially as a result of a shift from computer cluster to cloud computing.
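The computational independence of permutation iterations mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`count_extreme`, `permutation_pvalue`), the difference-in-means statistic, the chunking scheme, and the toy data are all assumptions made for illustration. Each chunk of permutations gets its own seed and shares no state with the others, so the chunks could in principle be dispatched to cluster nodes or cloud instances; here a thread pool stands in for such workers to keep the example self-contained.

```python
import random
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def count_extreme(task):
    """Run one independent chunk of permutations and count how many
    permuted statistics are at least as extreme as the observed one."""
    x, y, observed, n_perm, seed = task
    rng = random.Random(seed)  # per-chunk generator: no shared state
    pooled = list(x) + list(y)
    n_x = len(x)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        stat = abs(mean(pooled[:n_x]) - mean(pooled[n_x:]))
        if stat >= observed:
            hits += 1
    return hits


def permutation_pvalue(x, y, n_perm=2000, n_chunks=4, map_fn=map, seed=0):
    """Two-sided permutation p-value for a difference in means.

    map_fn decides where the chunks run: the built-in map runs them
    serially, while an executor's map method runs them on its workers.
    """
    observed = abs(mean(x) - mean(y))
    per_chunk = n_perm // n_chunks
    tasks = [(x, y, observed, per_chunk, seed + i) for i in range(n_chunks)]
    hits = sum(map_fn(count_extreme, tasks))
    total = per_chunk * n_chunks
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (total + 1)


# Hypothetical expression values for one gene in two groups (toy data).
group_a = [5.1, 4.9, 6.2, 5.8, 6.0, 5.5]
group_b = [3.9, 4.2, 4.0, 3.7, 4.4, 4.1]

p_serial = permutation_pvalue(group_a, group_b)
with ThreadPoolExecutor(max_workers=4) as pool:
    p_parallel = permutation_pvalue(group_a, group_b, map_fn=pool.map)
```

Because every chunk is seeded independently, the serial and pooled runs give the same p-value. For CPU-bound work in Python, a process pool or a set of remote workers would be the more realistic choice than threads; only `map_fn` would need to change.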
Similar resources
Modification of the Fast Global K-means Using a Fuzzy Relation with Application in Microarray Data Analysis
Recognizing genes with distinctive expression levels can help in prevention, diagnosis and treatment of the diseases at the genomic level. In this paper, fast Global k-means (fast GKM) is developed for clustering the gene expression datasets. Fast GKM is a significant improvement of the k-means clustering method. It is an incremental clustering method which starts with one cluster. Iteratively ...
Bridging Microarray Platforms To Extend the Utility of Gene Expression Profiles
Gene expression profiling experiments provide a wealth of data for use in the analysis of the transcriptome and have yielded results that have helped in the classification of disease type and suggestions of possible biological pathways. Two major types of platforms exist, cDNA platforms where ESTs are spotted onto a glass microscope slide and oligonucleotide platforms where each gene is represe...
ANAIS: Analysis of NimbleGen Arrays Interface
ANAIS is a user-friendly web-based tool for the processing of NimbleGen expression data. The interface reads single-channel microarray files generated by NimbleGen platforms and produces easily interpretable graphical and numerical results. It provides biologists six turnkey analysis modules: normalization, probe to gene, quality controls, differential expression, detection, queries a...
The Analysis for Virtualization Performance in Cluster and Cloud Computing
Virtualization technology is an active research topic in current cloud computing and services. Using virtualization technology in cloud or cluster computing brings many benefits, such as the ability to deploy virtual platforms rapidly, easier management of valuable resources, and cost reduction. In order to discover optimal performance for virtual platforms, several well-known...
FRA-PSO: A two-stage Resource Allocation Algorithm in Cloud Computing
Cloud computing offers a large amount of processing capacity and heterogeneous resources, meeting the requirements of numerous applications at diverse levels. Therefore, resource allocation is vital in cloud computing. Resource allocation is a technique by which resources such as CPU, RAM, and disk in cloud data centers are divided among cloud users. The resource utilization, cloud service p...
Journal: Methods of Information in Medicine
Volume 52, Issue 1
Pages: -
Publication date: 2013